-
Notifications
You must be signed in to change notification settings - Fork 271
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Shelved] Add flushed_roots to dirty_store #4304
base: master
Are you sure you want to change the base?
Conversation
@@ -6295,33 +6295,42 @@ impl AccountsDb { | |||
}); | |||
|
|||
// Always flush up to `requested_flush_root`, which is necessary for things like snapshotting. | |||
let cached_roots: BTreeSet<Slot> = self.accounts_cache.clear_roots(requested_flush_root); | |||
let flushed_roots: BTreeSet<Slot> = self.accounts_cache.clear_roots(requested_flush_root); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename: cached_roots -> flushed_roots for clarity.
// Only add to the uncleaned roots set *after* we've flushed the previous roots, | ||
// so that clean will actually be able to clean the slots. | ||
let num_new_roots = cached_roots.len(); | ||
// Regardless of whether this slot was *just* flushed from the cache by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An optimization to move the "max_flush_root" update out of the loop.
We only need to fetch_max with the max value in the "flushed_roots".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intuitively this looks right, but I wonder if there was a reason to not originally do it this way... Maybe worth looking at the git history to see?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code was introduced 4 years ago, and has never changed since them. @carllin
solana-labs@8f7c7fb#diff-1090394420d51617f3233275c2b65ed706b35b53b115fe65f82c682af8134a6fR3017
@carllin Is there any reason that we can't move set_max_flush_root
out of the loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks correct to me.
// Only add to the uncleaned roots set *after* we've flushed the previous roots, | ||
// so that clean will actually be able to clean the slots. | ||
let num_new_roots = cached_roots.len(); | ||
// Regardless of whether this slot was *just* flushed from the cache by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intuitively this looks right, but I wonder if there was a reason to not originally do it this way... Maybe worth looking at the git history to see?
// the cache is overwhelmed, we flushed some yet to be rooted frozen slots | ||
// These slots may then *later* be marked as root, so we still need to handle updating the | ||
// `max_flush_root` in the accounts cache. | ||
if let Some(storage) = self.storage.get_slot_storage_entry(root) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agave/accounts-db/src/accounts_db.rs
Line 7564 in 8db563d
self.uncleaned_pubkeys.insert(slot, dirty_keys); |
These slots has already been added to uncleaned_pubkeys when calculating accounts_delta_hash.
When clean runs, it will include all these dirty pubkeys in the candidates.
agave/accounts-db/src/accounts_db.rs
Line 2477 in 8db563d
self.uncleaned_pubkeys.remove(&uncleaned_slot) |
Therefore, I think we don't need to add them to dirty store at flush. It seems redundant and is a waste of time to scan those stores again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the code to only added to dirty_stores when the slot is not found in uncleaned_pubkeys, and a stat to count how many time we added them.
I think it should be zero. I have started a node the verify this. We will know soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, mystery solved, thank you.
With the lt hash, we'll eventually remove the accounts delta hash calculation. At that point we'll need to fix this then.
Would it make sense to not add to uncleaned_pubkeys when calculating the accounts delta hash, and instead only add the storage to dirty_stores during flush?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think we can do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The experimental run on mainnet shows that the stats are indeed zeros. This means that we don't need to add them to dirty storage at flush time.
I will shelve this PR, and start two new other PRs for the obsolete comment change and the fetch_max optimization change.
Problem
When we flush slots from accounts cache, we should add them to "dirty_store" for clean to pick them up.
Summary of Changes
Fixes #